Efficient Sampling Startup for Uniprocessor and Simultaneous Multithreading Simulation
Abstract
Modern architecture research relies heavily on detailed pipeline simulation. Simulating the full execution of an industry-standard benchmark can take weeks to months. Statistical sampling and techniques like SimPoint, which pick small sets of execution samples, have been shown to provide accurate results while significantly reducing simulation time. The inefficiencies in sampling are (a) needing the correct memory image to execute the sample, and (b) needing warm architecture state when simulating the sample. In this paper we examine efficient Sampling Startup techniques for representing the correct memory image during simulation and for dealing with warmup. Representing the correct memory image ensures the memory values consumed during the sample’s simulation are correct. Warmup techniques focus on reducing the error that arises when the architecture state is not fully representative of the complete execution that precedes the sample to be simulated. This paper presents several Sampling Startup techniques and compares them against previously proposed techniques for both uniprocessor and simultaneous multithreading architecture simulation.
Related resources
Asymmetric Multiprocessing for Simultaneous Multithreading Processors
Simultaneous Multithreading (SMT) has become common in commercially available processors with hardware support for dual contexts of execution. However, performance of SMT systems has been disappointing for many applications. Consequently, many SMT systems are operated in a single-context configuration to achieve better average throughput, depending on the application domain. This paper first ex...
Memory Subsystem Design for Multithreaded Processors
Multithreading processors pose new challenges and new opportunities for cache/memory hierarchy design. Multithreading significantly alters the data reference stream seen by the memory subsystem. Multithreading also demands very different performance characteristics from the cache hierarchy than a typical (uniprocessor) CPU. This paper is specifically concerned with memory hierarchy design consi...
Thread Scheduling For Shared Caches ECE 742 Final Project Report
Simultaneous multithreading (SMT) processors and chip multiprocessors (CMP) with shared caches usually require a primary cache increase by a factor proportional to the number of execution contexts to retain the cache performance of the uniprocessor. In this paper we study depth-first task scheduling, which was recently shown to reduce the number of cache misses when a single multithreaded appli...
Chip Multiprocessors – A Cost-effective Alternative to Simultaneous Multithreading
In this paper we describe the principles of the chip multiprocessor architecture, overview design alternatives and present some example processors of this type. We discuss the results of several simulations where chip multiprocessor was compared to other advanced processor architectures including superscalars and simultaneous multithreading processors. Although simultaneous multithreading seems...
Efficient Sampling Startup for Sampled Processor Simulation
Modern architecture research relies heavily on detailed pipeline simulation. Simulating the full execution of an industry standard benchmark can take weeks to months. Statistical sampling and sample techniques like SimPoint that pick small sets of execution samples have been shown to provide accurate results while significantly reducing simulation time. The inefficiencies in sampling are (a) ne...